Learning the Linear Quadratic Regulator from Nonlinear Observations
Zakaria Mhammedi (ANU and Data61), Dylan J. Foster (MIT), Max Simchowitz (UC Berkeley)

Neural Information Processing Systems

To enable sample-efficient learning, we assume the learner has access to a class of decoder functions (e.g., neural networks) that is flexible enough to capture the mapping from observations to latent states.


MESA: Boost Ensemble Imbalanced Learning with MEta-SAmpler

Neural Information Processing Systems

Imbalanced learning (IL), i.e., learning unbiased models from class-imbalanced data, is a challenging problem. Typical IL methods, including resampling and reweighting, are designed around heuristic assumptions. In complex tasks where those assumptions do not hold, they often suffer from unstable performance, poor applicability, and high computational cost.
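As context for the heuristic resampling methods the abstract refers to, here is a minimal sketch of random undersampling, the simplest such baseline (this is illustrative, not the MESA meta-sampler itself; all names are ours):

```python
import random
from collections import Counter

def undersample(X, y, seed=0):
    """Randomly undersample every class down to the minority-class size."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    n_min = min(len(v) for v in by_class.values())
    Xs, ys = [], []
    for label, xs in by_class.items():
        for xi in rng.sample(xs, n_min):  # drop surplus majority examples
            Xs.append(xi)
            ys.append(label)
    return Xs, ys

X = [[i] for i in range(100)]
y = [0] * 90 + [1] * 10            # 9:1 class imbalance
Xb, yb = undersample(X, y)
print(Counter(yb))                 # balanced: 10 examples per class
```

The downside this illustrates is exactly the abstract's point: the heuristic discards most of the majority-class data regardless of how informative it is, which MESA's learned meta-sampler aims to avoid.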


Fast geometric learning with symbolic matrices
Jean Feydy*, Joan Alexis Glaunès* (Imperial College London)

Neural Information Processing Systems

Geometric methods rely on tensors that can be encoded using a symbolic formula and data arrays, such as kernel and distance matrices. We present an extension for standard machine learning frameworks that provides comprehensive support for this abstraction on CPUs and GPUs: our toolbox combines a versatile, transparent user interface with fast runtimes and low memory usage. Unlike general-purpose acceleration frameworks such as XLA, our library turns generic Python code into binaries whose performance is competitive with state-of-the-art geometric libraries (such as FAISS for nearest-neighbor search), with the added benefit of flexibility. We perform an extensive evaluation on a broad class of problems: Gaussian modelling, K-nearest-neighbors search, geometric deep learning, non-Euclidean embeddings, and optimal transport theory.
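To make the "symbolic formula plus data arrays" idea concrete, here is a plain-NumPy sketch of the underlying principle: reducing a Gaussian kernel matrix against a vector block by block, so the full N x M matrix is never materialized (this is our illustration of the concept, not the library's actual API):

```python
import numpy as np

def gaussian_kernel_matvec(x, y, b, sigma=1.0, chunk=256):
    """Compute K @ b with K_ij = exp(-|x_i - y_j|^2 / (2 sigma^2)),
    materializing only (chunk x M) blocks instead of the full N x M matrix."""
    out = np.zeros(len(x))
    for start in range(0, len(x), chunk):
        xc = x[start:start + chunk]                           # (c, d)
        sq = ((xc[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # (c, M)
        out[start:start + chunk] = np.exp(-sq / (2 * sigma**2)) @ b
    return out

rng = np.random.default_rng(0)
x, y = rng.normal(size=(1000, 3)), rng.normal(size=(500, 3))
b = rng.normal(size=500)
# Sanity check against the dense computation:
dense = np.exp(-((x[:, None] - y[None]) ** 2).sum(-1) / 2) @ b
assert np.allclose(gaussian_kernel_matvec(x, y, b), dense)
```

A symbolic-matrix toolbox takes this one step further by compiling the formula itself, so the block loop runs in fused GPU kernels rather than in Python.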


Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability
Fan Chen (MIT), Dylan J. Foster (Microsoft Research), Yanjun Han (New York University)

Neural Information Processing Systems

We develop a unifying framework for information-theoretic lower bounds in statistical estimation and interactive decision making. Classical lower bound techniques--such as Fano's method, Le Cam's method, and Assouad's lemma--are central to the study of minimax risk in statistical estimation, yet are insufficient to provide tight lower bounds for interactive decision making algorithms that collect data interactively (e.g., algorithms for bandits and reinforcement learning). Recent work of Foster et al. [40, 42] provides minimax lower bounds for interactive decision making using seemingly different analysis techniques from the classical methods. These results--which are proven using a complexity measure known as the Decision-Estimation Coefficient (DEC)--capture difficulties unique to interactive learning, yet do not recover the tightest known lower bounds for passive estimation. We propose a unified view of these distinct methodologies through a new lower bound approach called the interactive Fano method. As an application, we introduce a novel complexity measure, the Fractional Covering Number, which yields new lower bounds for interactive decision making that extend the DEC methodology by incorporating the complexity of estimation. Using the fractional covering number, we (i) provide a unified characterization of learnability for any stochastic bandit problem, and (ii) close the remaining gap between the upper and lower bounds of Foster et al. [40, 42] (up to polynomial factors) for any interactive decision making problem in which the underlying model class is convex.
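For readers unfamiliar with the classical technique the paper generalizes, a standard statement of Fano's method (in one common form; the notation here is illustrative and not taken from the paper):

```latex
% Classical Fano method: given parameters \theta_1,\dots,\theta_M with
% pairwise separation d(\theta_i,\theta_j) \ge 2\delta, any estimator
% \hat\theta based on the observation X satisfies
\[
  \max_{i \in [M]} \;
  \mathbb{P}_{\theta_i}\!\bigl( d(\hat\theta, \theta_i) \ge \delta \bigr)
  \;\ge\; 1 - \frac{I(V; X) + \log 2}{\log M},
\]
% where V is uniform on [M] and I(V;X) is the mutual information
% between the random index and the observation.
```

The interactive Fano method of the abstract replaces this passive-observation setup with data collected adaptively by the algorithm itself.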



Fast Rates for Bandit PAC Multiclass Classification

Neural Information Processing Systems

We study multiclass PAC learning with bandit feedback, where each input must be classified into one of multiple possible labels, and feedback is limited to whether or not the predicted label is correct.
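The bandit feedback protocol described above can be sketched in a few lines; the uniform-random learner below is a placeholder of our own, only meant to show what the learner does and does not observe each round:

```python
import random

def bandit_round(predict, x, true_label):
    """One round of bandit multiclass classification: the learner
    predicts a label and observes only whether it was correct."""
    y_hat = predict(x)
    feedback = (y_hat == true_label)   # one bit; the true label stays hidden
    return y_hat, feedback

rng = random.Random(0)
K = 5                                  # number of labels (illustrative)
predict = lambda x: rng.randrange(K)   # placeholder random learner
correct = sum(bandit_round(predict, x, x % K)[1] for x in range(1000))
print(correct / 1000)                  # roughly 1/K for random guessing
```

The contrast with full-information multiclass learning is the single bit of feedback per round, which is what makes fast PAC rates nontrivial in this setting.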


Equal Opportunity in Online Classification with Partial Feedback

Neural Information Processing Systems

We study an online classification problem with partial feedback in which individuals arrive one at a time from a fixed but unknown distribution, and must be classified as positive or negative. Our algorithm only observes the true label of an individual if they are given a positive classification. This setting captures many classification problems for which fairness is a concern: for example, in criminal recidivism prediction, recidivism is only observed if the inmate is released; in lending applications, loan repayment is only observed if the loan is granted. We require that our algorithms satisfy common statistical fairness constraints (such as equalizing false positive or negative rates -- introduced as "equal opportunity" in [18]) at every round, with respect to the underlying distribution. We give upper and lower bounds characterizing the cost of this constraint in terms of the regret rate (and show that it is mild), and give an oracle efficient algorithm that achieves the upper bound.
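As a small illustration of the fairness constraint in question, the sketch below computes per-group empirical false positive rates and their gap; the data layout and names are ours, not the paper's:

```python
def false_positive_rates(records):
    """Per-group empirical false positive rate: P(pred=1 | label=0, group).
    Each record is a (group, true_label, predicted_label) triple."""
    neg, fp = {}, {}
    for group, label, pred in records:
        if label == 0:                       # condition on true negatives
            neg[group] = neg.get(group, 0) + 1
            fp[group] = fp.get(group, 0) + (pred == 1)
    return {g: fp[g] / neg[g] for g in neg}

records = [
    ("A", 0, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),
    ("B", 0, 1), ("B", 0, 1), ("B", 1, 0), ("B", 0, 0),
]
rates = false_positive_rates(records)
print(rates)                       # A: 1/3, B: 2/3
print(abs(rates["A"] - rates["B"]))
```

Note that under the paper's partial-feedback model this quantity is what must be equalized with respect to the underlying distribution at every round, which is harder than the post-hoc audit shown here because negative predictions never reveal their true labels.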


Universal Rates for Active Learning

Neural Information Processing Systems

In this work we study the problem of actively learning binary classifiers from a given concept class, i.e., learning by utilizing unlabeled data and submitting targeted queries about their labels to a domain expert. We evaluate the quality of our solutions by considering the learning curves they induce, i.e., the rate of decrease of the misclassification probability as the number of label queries increases. The majority of the literature on active learning has focused on obtaining uniform guarantees on the error rate which are only able to explain the upper envelope of the learning curves over families of different data-generating distributions. We diverge from this line of work and we focus on the distribution-dependent framework of universal learning whose goal is to obtain guarantees that hold for any fixed distribution, but do not apply uniformly over all the distributions. We provide a complete characterization of the optimal learning rates that are achievable by algorithms that have to specify the number of unlabeled examples they use ahead of their execution. Moreover, we identify combinatorial complexity measures that give rise to each case of our tetrachotomic characterization. This resolves an open question that was posed by Balcan et al. (2010). As a byproduct of our main result, we develop an active learning algorithm for partial concept classes that achieves exponential learning rates in the uniform setting.


Scalable Early Childhood Reading Performance Prediction
Zanming Huang

Neural Information Processing Systems

Models for student reading performance can empower educators and institutions to proactively identify at-risk students, thereby enabling early and tailored instructional interventions. However, there are no suitable publicly available educational datasets for modeling and predicting future reading performance. In this work, we introduce the Enhanced Core Reading Instruction (ECRI) dataset, a novel large-scale longitudinal tabular dataset collected across 44 schools with 6,916 students and 172 teachers. We leverage the dataset to empirically evaluate the ability of state-of-the-art machine learning models to recognize early childhood educational patterns in multivariate and partial measurements. Specifically, we demonstrate that a simple self-supervised strategy, in which a Multi-Layer Perceptron (MLP) network is pre-trained over masked inputs, outperforms several strong baselines while generalizing over diverse educational settings. To facilitate future developments in precise modeling and responsible use of models for individualized and early intervention strategies, our data and code are available at https://ecri-data.github.io/.
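The masked-input pre-training objective mentioned above can be sketched as follows; this is our generic illustration of the idea (mask rate, shapes, and function names are assumptions, not details from the paper):

```python
import numpy as np

def mask_inputs(X, mask_rate=0.3, rng=None):
    """Randomly zero out a fraction of entries; return the corrupted
    inputs and the boolean mask (the reconstruction target is X itself)."""
    rng = rng or np.random.default_rng(0)
    mask = rng.random(X.shape) < mask_rate
    return np.where(mask, 0.0, X), mask

def masked_reconstruction_loss(X, X_pred, mask):
    """MSE computed only on the masked entries, as in masked pre-training."""
    return float(((X_pred - X)[mask] ** 2).mean())

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 10))          # e.g. 64 students x 10 measurements
X_corrupt, mask = mask_inputs(X, rng=rng)
# A perfect reconstruction drives the masked loss to zero:
assert masked_reconstruction_loss(X, X, mask) == 0.0
# Predicting all zeros scores the mean square of the masked entries:
print(masked_reconstruction_loss(X, np.zeros_like(X), mask))
```

An MLP trained to minimize this loss on `X_corrupt` learns to impute measurements from the remaining ones, which is also why the objective suits tabular data with partial measurements.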


A Benchmark Task Details

Neural Information Processing Systems

The risk of lead exposure is disproportionately higher for children who are poor, non-Hispanic Black, living in large metropolitan areas, or living in older housing. The CDC sets a national blood lead reference value for children; established at 5 micrograms per deciliter (µg/dL) of blood in 2012, it was lowered to 3.5 µg/dL in 2021.